Technical Stock Price Analysis with Exponential Moving Average (EMA) and Forecasting using ARIMA Model

_____________________________________________________

The analysis comprises two models: (1) Exponentially Weighted Moving Average (EMA) analysis and (2) ARIMA. The first is intended to produce buy/sell signals from crossovers between the price and its historical average; the second is trained to forecast stock prices over a specified number of observation points.

1. Data and Descriptive Statistics

  • The data for this study consist of daily closing prices of 6 stocks from three sectors: Health Care, Technology and Food. The stocks included in the analysis are: Pfizer, Inc. (PFE); Nasdaq, Inc. (NSDAQ); Apple, Inc. (AAPL); Nestle SA (NSRGF); AstraZeneca Plc. (AZN); Coca-Cola, Inc. (KO).
  • Time period: 2012-01 to 2021-12 (first observation 2012-01-03, last observation 2021-12-02).
  • Data were collected from the Nasdaq US website (https://www.nasdaq.com/).
In [1]:
import pandas as pd
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdt
import seaborn as sns
import mplfinance
from pylab import rcParams
%matplotlib inline
import os
import warnings
warnings.filterwarnings('ignore')
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
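# Legacy ARIMA API: statsmodels.tsa.arima_model was deprecated and later removed (statsmodels 0.13+),
# so the ARIMA cells below need an older statsmodels release
from statsmodels.tsa.arima_model import ARIMA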
from pmdarima.arima import auto_arima
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
In [2]:
Prices = pd.read_csv("C:\\...\\Closing prices 2012-2021.csv", parse_dates=['Date'])
Prices['Date']= pd.to_datetime(Prices['Date'])
Prices = Prices.sort_values(by="Date")
In [3]:
Prices.set_index('Date', inplace = True)
Prices.index.name = None
Prices
Out[3]:
            Close PFE USD  Close NSDAQ USD  Close NSRGF USD  Close AAPL USD  Close AZN USD  Close KO USD
2012-01-03          21.48            24.96            58.25           14.69          23.86         35.07
2012-01-04          21.28            24.62            57.75           14.77          23.74         34.85
2012-01-05          21.11            24.66            57.20           14.93          23.43         34.69
2012-01-06          21.08            24.43            56.60           15.09          23.49         34.47
2012-01-09          21.33            24.33            57.41           15.06          23.29         34.47
...                   ...              ...              ...             ...            ...           ...
2021-11-26          54.00           203.68           131.31          156.81          56.58         53.73
2021-11-29          52.40           209.08           131.30          160.24          55.53         54.58
2021-11-30          53.73           203.23           129.00          165.30          54.83         52.45
2021-12-01          54.68           199.03           127.57          164.77          54.88         52.30
2021-12-02          53.04           201.23           128.48          163.76          54.79         53.07

2497 rows × 6 columns

1.1 Described Data

Descriptive statistics are given in the table below. For every selected stock it reports the number of observations (count); the mean, i.e. the estimated central value (mean); the standard deviation, which quantifies the dispersion of the data (std); the minimum (min) and maximum (max) price; and the 25th, 50th and 75th percentiles.

The total number of observations is 2497 per stock. NSDAQ has the largest standard deviation, which indicates that its price varied the most in absolute (dollar) terms over the period.

In [4]:
Prices.describe()
Out[4]:
       Close PFE USD  Close NSDAQ USD  Close NSRGF USD  Close AAPL USD  Close AZN USD  Close KO USD
count    2497.000000      2497.000000      2497.000000     2497.000000    2497.000000   2497.000000
mean       33.252002        75.290224        84.917145       48.040521      36.423244     44.604537
std         5.767063        42.992460        18.819134       37.214999      10.341610      5.607982
min        20.480000        21.240000        55.800000       13.950000      20.020000     33.500000
25%        29.640000        39.920000        72.850000       23.610000      29.040000     40.600000
50%        33.280000        68.340000        77.600000       32.190000      34.280000     43.510000
75%        36.320000        95.130000        99.040000       53.060000      41.270000     47.630000
max        54.680000       212.830000       135.640000      165.300000      63.830000     60.130000
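
The standard deviations in the table above are expressed in dollars, so they depend on each stock's price level. A minimal sketch, using only the mean and std figures copied from that table, of how a relative measure (the coefficient of variation, std divided by mean) could be computed for comparison; the short ticker labels are used here only for readability:

```python
import pandas as pd

# Mean and standard deviation copied from the describe() table above.
stats = pd.DataFrame(
    {
        "mean": [33.252002, 75.290224, 84.917145, 48.040521, 36.423244, 44.604537],
        "std": [5.767063, 42.992460, 18.819134, 37.214999, 10.341610, 5.607982],
    },
    index=["PFE", "NSDAQ", "NSRGF", "AAPL", "AZN", "KO"],
)

# Coefficient of variation: dispersion relative to the average price level.
stats["cv"] = stats["std"] / stats["mean"]

print(stats.sort_values("cv", ascending=False).round(3))
```

On this relative scale the ranking can differ from the ranking by raw standard deviation, which is worth keeping in mind when comparing stocks with very different price levels.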

2. Exponentially Weighted Moving Average (EMA) Analysis by Sector

A moving average is a simple form of technical analysis used in stock trading strategies. The average is taken over a period chosen by the trader; it smooths price trends by filtering out the “noise” of random short-term fluctuations and helps to decide when a stock should be bought or sold (buy/sell signals are produced by crossovers with, and divergences from, the calculated historical average).

In this analysis, the Exponentially Weighted Moving Average was selected over the Simple Moving Average (SMA) because it gives more weight to the most recent data points, whereas the SMA weights all data points in its window equally. For this reason, the EMA reacts more quickly to price changes than the SMA.

Both a short-term and a long-term EMA were calculated. The short-term EMA uses 20 days; the long-term EMA uses 200 days (250 days for NSDAQ). These spans follow common practice in several publications on moving-average methodology.

The Exponentially Weighted Moving Average is calculated as follows:

**EMA(T) = PRICE(T) × K + EMA(Y) × (1 − K)**

where:

  • T = today
  • Y = yesterday
  • N = number of days in EMA
  • K = 2 ÷ (N + 1) - the multiplier

Source: https://www.dailyfx.com/education/moving-averages/ema-exponential-moving-average.html
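
The recursion above can be sketched directly and compared with the pandas call used in the cells below. This is an illustrative sketch, not part of the original analysis: the helper name `ema_recursive` is hypothetical, the first EMA value is assumed to be seeded with the first price (which is what `ewm(..., adjust=False)` does), and a short span of 3 is used only to keep the numbers easy to follow.

```python
import numpy as np
import pandas as pd

def ema_recursive(prices, n):
    """EMA(T) = PRICE(T) * K + EMA(Y) * (1 - K), seeded with the first price."""
    k = 2 / (n + 1)  # the multiplier K
    ema = [prices[0]]
    for price in prices[1:]:
        ema.append(price * k + ema[-1] * (1 - k))
    return np.array(ema)

# First five PFE closing prices from the table in section 1.
pfe = [21.48, 21.28, 21.11, 21.08, 21.33]

manual = ema_recursive(pfe, n=3)
pandas_ema = pd.Series(pfe).ewm(span=3, adjust=False).mean().to_numpy()

print(manual)
print(np.allclose(manual, pandas_ema))
```

With `adjust=False`, pandas applies the same recursion with K = 2 ÷ (span + 1), so `span` plays the role of N in the formula.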

2.1 Health Care Sector: Short & Long Term EMA

Health care sector companies analyzed: Pfizer, Inc (PFE) and AstraZeneca, Plc (AZN)

(1) Calculating short and long term Exponentially Weighted Moving Average for PFE:

In [5]:
PFE_short_term_EMA = Prices['Close PFE USD'].ewm(span=20, adjust=False).mean()
PFE_long_term_EMA = Prices['Close PFE USD'].ewm(span=200, adjust=False).mean()

As the graphs below show, the long-term moving average line is smoother than the short-term one, because the long-term EMA averages price swings over a longer period. The long-term EMA shows how a stock behaves over roughly a year and helps to reveal the trend, while the short-term EMA is preferred for short-term swing trading.

The general, simplified rule for using a moving average as a trading strategy is that as long as the price stays above the EMA, higher prices are expected; conversely, as long as the price stays below the EMA, lower prices are expected. The change is expected when the price line crosses the EMA line in the graph. This is only one simplified strategy among many; another generates buy/sell signals when the short-term moving average crosses the long-term moving average.
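
The short/long crossover variant mentioned above could be sketched as follows. This is an assumption-laden illustration rather than the strategy used in this analysis: the function name `crossover_signals` is hypothetical, the price series is synthetic (a fall followed by a rise), and short spans of 5 and 20 are used so that a crossover appears within 100 points.

```python
import numpy as np
import pandas as pd

def crossover_signals(close, short_span=20, long_span=200):
    """Mark the days on which the short EMA crosses the long EMA."""
    short_ema = close.ewm(span=short_span, adjust=False).mean()
    long_ema = close.ewm(span=long_span, adjust=False).mean()
    above = (short_ema > long_ema).astype(int)
    # +1 on the day the short EMA moves above the long EMA (buy),
    # -1 on the day it drops below (sell), 0 otherwise.
    signal = above.diff().fillna(0).astype(int)
    return pd.DataFrame(
        {"close": close, "short_ema": short_ema, "long_ema": long_ema, "signal": signal}
    )

# Synthetic prices for illustration only: 50 falling points, then 50 rising points.
close = pd.Series(np.concatenate([np.linspace(100, 60, 50), np.linspace(60, 120, 50)]))

signals = crossover_signals(close, short_span=5, long_span=20)
print(signals[signals["signal"] != 0])
```

Applied to the real data, `close` would be one of the `Prices` columns and the spans would be the 20 and 200 days used in this section.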

In [6]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close PFE USD'], "cadetblue", label='PFE Close')
ax[0].plot(PFE_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close PFE USD'], "cadetblue", label='PFE Close')
ax[1].plot(PFE_short_term_EMA, 'indianred', label='Short-term EMA')

ax[0].set_title('PFE Long-Term EMA')
ax[1].set_title('PFE Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

For PFE, both graphs show a clear up-trend, which suggests that higher peaks can be expected.

(2) Calculating short and long term Exponentially Weighted Moving Average for AZN:

In [7]:
AZN_short_term_EMA = Prices['Close AZN USD'].ewm(span=20, adjust=False).mean()
AZN_long_term_EMA = Prices['Close AZN USD'].ewm(span=200, adjust=False).mean()
In [8]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[0].plot(AZN_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[1].plot(AZN_short_term_EMA, 'darkorchid', label='Short-term EMA')

ax[0].set_title('AZN Long-Term EMA')
ax[1].set_title('AZN Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

For AZN, both graphs show the beginning of a down-trend, which suggests lower peaks and lower troughs over time.

2.2 Technology Sector: Short & Long Term EMA

Technology sector companies analyzed: Nasdaq, Inc. (NSDAQ) and Apple, Inc. (AAPL)

(1) Calculating short and long term Exponentially Weighted Moving Average for NSDAQ:

In [9]:
NSDAQ_short_term_EMA = Prices['Close NSDAQ USD'].ewm(span=20, adjust=False).mean()
NSDAQ_long_term_EMA = Prices['Close NSDAQ USD'].ewm(span=250, adjust=False).mean()
In [10]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[0].plot(NSDAQ_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[1].plot(NSDAQ_short_term_EMA, 'darkorchid', label='Short-term EMA')


ax[0].set_title('NSDAQ Long-Term EMA')
ax[1].set_title('NSDAQ Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

For NSDAQ, the long-term EMA line shows a clear up-trend, which suggests higher peaks in the long-term perspective; the short-term EMA, however, shows a down-trend.

(2) Calculating short and long term Exponentially Weighted Moving Average for AAPL:

In [11]:
AAPL_short_term_EMA = Prices['Close AAPL USD'].ewm(span=20, adjust=False).mean()
AAPL_long_term_EMA = Prices['Close AAPL USD'].ewm(span=200, adjust=False).mean()
In [12]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AAPL USD'], "slategrey", label='AAPL Close')
ax[0].plot(AAPL_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close AAPL USD'], "slategrey", label='AAPL Close')
ax[1].plot(AAPL_short_term_EMA, 'darkorchid', label='Short-term EMA')

ax[0].set_title('AAPL Long-Term EMA')
ax[1].set_title('AAPL Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

For AAPL, both graphs show a clear up-trend, which suggests that higher peaks can be expected.

2.3 Food Sector: Short & Long Term EMA

Food sector companies analyzed: Nestle SA (NSRGF) and Coca-Cola, Inc. (KO)

(1) Calculating short and long term Exponentially Weighted Moving Average for NSRGF:

In [13]:
NSRGF_short_term_EMA = Prices['Close NSRGF USD'].ewm(span=20, adjust=False).mean()
NSRGF_long_term_EMA = Prices['Close NSRGF USD'].ewm(span=200, adjust=False).mean()
In [14]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close NSRGF USD'], "darkseagreen", label='NSRGF Close')
ax[0].plot(NSRGF_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close NSRGF USD'], "darkseagreen", label='NSRGF Close')
ax[1].plot(NSRGF_short_term_EMA, 'darkorchid', label='Short-term EMA')

ax[0].set_title('NSRGF Long-Term EMA')
ax[1].set_title('NSRGF Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

As with NSDAQ, the NSRGF long-term EMA line shows a clear up-trend, which suggests higher peaks in the long-term perspective, while the short-term EMA shows a down-trend.

(2) Calculating short and long term Exponentially Weighted Moving Average for KO:

In [15]:
KO_short_term_EMA = Prices['Close KO USD'].ewm(span=20, adjust=False).mean()
KO_long_term_EMA = Prices['Close KO USD'].ewm(span=200, adjust=False).mean()
In [16]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close KO USD'], "darkseagreen", label='KO Close')
ax[0].plot(KO_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close KO USD'], "darkseagreen", label='KO Close')
ax[1].plot(KO_short_term_EMA, 'darkorchid', label='Short-term EMA')

ax[0].set_title('KO Long-Term EMA')
ax[1].set_title('KO Short-Term EMA')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

Once again, as with NSDAQ and NSRGF, the KO graphs show an up-trend in the long-term EMA but a down-trend in the short-term EMA. However, the long-term up-trend is less pronounced than for NSDAQ. When a stock price crosses its 200-day EMA, it is treated as a technical signal that a reversal has occurred (seen in the KO Long-Term EMA graph).

2.4 Interim Conclusions

The Exponentially Weighted Moving Average is used to identify the predominant trend and patterns in the market, as it reduces the noise of everyday price fluctuations. Many trading strategies are built on the EMA; the most straightforward one is as follows:

(1) A long position should be held as long as the price series is above the EMA line; (2) a short position should be held as long as the price series is below the EMA line (either the short-term or the long-term EMA, depending on the aims of the investing/trading).

Sources:

  • https://tradingstrategyguides.com/exponential-moving-average-strategy/
  • https://tradeciety.com/how-to-use-moving-averages/
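
Rules (1) and (2) above can be written as a simple position indicator. A minimal sketch, assuming +1 stands for a long position, -1 for a short position and 0 for no position when the price equals the EMA; the span of 3 is chosen only to keep the example short and is not the span used in the analysis:

```python
import numpy as np
import pandas as pd

# First five AAPL closing prices from the table in section 1.
close = pd.Series([14.69, 14.77, 14.93, 15.09, 15.06])
ema = close.ewm(span=3, adjust=False).mean()

# +1 while the price is above the EMA, -1 while it is below, 0 if they coincide.
position = pd.Series(
    np.where(close > ema, 1, np.where(close < ema, -1, 0)),
    index=close.index,
)

print(pd.DataFrame({"close": close, "ema": ema.round(3), "position": position}))
```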

3. Stock Price Forecasting with Autoregressive Integrated Moving Average (ARIMA): AZN and NSDAQ

Checking AZN and NSDAQ prices over the period 2012-01 to 2021-12:

In [17]:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[0].set_title('AZN Price 2012 01-2021 12')
ax[1].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[1].set_title('NSDAQ Price 2012 01-2021 12')

ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)

plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")

Data used for ARIMA should be stationary; the graphs above suggest that the data are non-stationary. To confirm this, the Augmented Dickey-Fuller (ADF) test can be performed.

Hypotheses for the ADF test:

*H0: Time series non-stationary*

*H1: Time series stationary*

Checking whether the AZN and NSDAQ time series are stationary by using the Augmented Dickey-Fuller test:

In [18]:
print("ADF p-value AZN:", adfuller(Prices['Close AZN USD'])[1])
print("ADF p-value NSDAQ:", adfuller(Prices['Close NSDAQ USD'])[1])
ADF p-value AZN: 0.7996286742279053
ADF p-value NSDAQ: 1.0

Since the p-value for both stocks is greater than 0.05, H0 cannot be rejected: the time series are treated as non-stationary and should be transformed before a reliable ARIMA model can be built.

3.1 Time Series Decomposition

Seasonality and trend in AZN and NSDAQ prices should be separated from the series. The graphs below show an upward trend for both stocks.

In [19]:
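# `period` replaces the deprecated `freq` argument (statsmodels >= 0.11)
AZN_seasonality = seasonal_decompose(Prices['Close AZN USD'], model='additive', period=60)
# plot() creates its own figure, so a separate plt.figure() call would only leave an empty one
fig = AZN_seasonality.plot()
fig.set_size_inches(16, 9)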
In [20]:
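# `period` replaces the deprecated `freq` argument (statsmodels >= 0.11)
NSDAQ_seasonality = seasonal_decompose(Prices['Close NSDAQ USD'], model='additive', period=60)
# plot() creates its own figure, so a separate plt.figure() call would only leave an empty one
fig = NSDAQ_seasonality.plot()
fig.set_size_inches(16, 9)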

To stabilise the variance, the logarithm of each series is taken. According to the methodology, a rolling average is then calculated on the logged values; note that rolling(12) below averages over 12 observations (trading days), not 12 months.

Logging AZN and NSDAQ values and calculating the 12-observation rolling mean and standard deviation:

In [21]:
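rcParams['figure.figsize'] = 16, 9

# Log-transform each series, then compute the rolling mean and standard deviation
# over a window of 12 observations; separate names keep the AZN results from being overwritten
AZN_log = np.log(Prices['Close AZN USD'])
AZN_MA = AZN_log.rolling(12).mean()
AZN_STD = AZN_log.rolling(12).std()

NSDAQ_log = np.log(Prices['Close NSDAQ USD'])
NSDAQ_MA = NSDAQ_log.rolling(12).mean()
NSDAQ_STD = NSDAQ_log.rolling(12).std()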

With the values logged, the ARIMA model can be built. The log transform alone does not remove the trend; that is handled by the differencing term d of the ARIMA model, which is selected in the next section.
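
An integer window passed to `rolling` counts rows, not calendar periods. The sketch below shows the difference between a 12-row window and a window of roughly one trading year on a synthetic business-day series; the figure of 252 trading days per year is a common convention assumed here, not a value taken from this analysis.

```python
import numpy as np
import pandas as pd

# Synthetic log prices on 300 business days (illustrative only).
dates = pd.bdate_range("2021-01-01", periods=300)
rng = np.random.default_rng(1)
log_price = pd.Series(np.log(50) + np.cumsum(0.01 * rng.standard_normal(300)), index=dates)

ma_12_rows = log_price.rolling(12).mean()    # 12 trading days, about two and a half weeks
ma_252_rows = log_price.rolling(252).mean()  # about one trading year (assumed convention)

# A window of w rows produces its first value at row w, so w - 1 values are missing.
print(ma_12_rows.notna().sum(), ma_252_rows.notna().sum())
```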

3.2 Building ARIMA Model: AZN and NSDAQ

First, the data should be split into training and test sets. As shown below, for AZN 70% of the dataset was selected for training and the remaining 30% for testing, while for NSDAQ 90% was selected for training and 10% for testing.

In [22]:
train_AZN, test_AZN = AZN_log[0:int(len(AZN_log)*0.70)], AZN_log[int(len(AZN_log)*0.70):]
train_NSDAQ, test_NSDAQ = NSDAQ_log[0:int(len(NSDAQ_log)*0.90)], NSDAQ_log[int(len(NSDAQ_log)*0.90):]

plt.figure(figsize=(16,9))
plt.plot(train_AZN, 'cadetblue', label='AZN Train')  # plot only the training slice under the 'Train' label
plt.plot(test_AZN, 'indianred', label='AZN Test')

plt.plot(train_NSDAQ, 'slategrey', label='NSDAQ Train')
plt.plot(test_NSDAQ, 'sandybrown', label='NSDAQ Test')
plt.legend(loc='upper left', frameon=False)
Out[22]:
<matplotlib.legend.Legend at 0x26a511e96a0>

An ARIMA model has three parameters, p, d and q, that must be specified. In this analysis, auto ARIMA is used to select them automatically. To build a more accurate model, p, d and q should be evaluated more carefully by applying additional tests.

In [23]:
ARIMA_Model = auto_arima(train_AZN, start_p=0,start_q=0,test='adf', max_p=3, max_q=3, m=1, d=None, seasonal=False, start_P=0, D=0, 
              trace=True, error_action='ignore',suppress_warnings=True, stepwise=True)
Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=-9968.176, Time=0.21 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=-9967.525, Time=0.23 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=-9967.563, Time=0.20 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=-9969.487, Time=0.15 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=-9965.028, Time=0.55 sec

Best model:  ARIMA(0,1,0)(0,0,0)[0]          
Total fit time: 1.355 seconds

Auto ARIMA selected the p, d, q values that best fit the AZN dataset (lowest AIC):

p = 0

d = 1

q = 0

Since the values are known, they can be used in the ARIMA model for AZN.

In [24]:
AZN_ARIMA_Model = ARIMA(train_AZN, order=(0,1,0))
fitted_from_above_AZN= AZN_ARIMA_Model.fit()
fitted_from_above_AZN.summary()
C:\Users\Simona\anaconda3\lib\site-packages\statsmodels\tsa\base\tsa_model.py:581: ValueWarning: A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.
  warnings.warn('A date index has been provided, but it has no'
Out[24]:
ARIMA Model Results
Dep. Variable:  D.Close AZN USD     No. Observations:     1746
Model:          ARIMA(0, 1, 0)      Log Likelihood        4986.088
Method:         css                 S.D. of innovations   0.014
Date:           Sun, 05 Dec 2021    AIC                   -9968.176
Time:           23:17:22            BIC                   -9957.246
Sample:         1                   HQIC                  -9964.135

         coef      std err   z        P>|z|    [0.025    0.975]
const    0.0003    0.000     0.830    0.407    -0.000    0.001
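
The summary above reports a constant term, so the fitted AZN model is an ARIMA(0,1,0) with a constant, which is a random walk with drift. Its point forecast h steps ahead is the last observed value plus h times the drift, i.e. a straight line. A minimal sketch of that forecast, where the function name and the last training price of 30.0 are hypothetical and only the drift of 0.0003 and the horizon of 750 are taken from this notebook:

```python
import numpy as np

def random_walk_drift_forecast(last_value, drift, steps):
    """Point forecast of ARIMA(0,1,0) with a constant: last_value + h * drift."""
    horizons = np.arange(1, steps + 1)
    return last_value + drift * horizons

# Hypothetical last training price; the model works on log prices.
last_log_price = np.log(30.0)

fc = random_walk_drift_forecast(last_log_price, drift=0.0003, steps=750)

# Convert the first and last forecast back from log scale to price scale.
print(np.exp(fc[[0, -1]]))
```

This explains why the forecast line for such a model cannot follow swings in the test data: it only extends the average growth rate seen in training.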
In [25]:
ARIMA_Model = auto_arima(train_NSDAQ, start_p=0,start_q=0,test='adf', max_p=3, max_q=3, m=1, d=None, seasonal=False, start_P=0, D=0, 
              trace=True, error_action='ignore',suppress_warnings=True, stepwise=True)
Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=-12533.901, Time=0.30 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=-12543.093, Time=0.33 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=-12542.377, Time=0.26 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=-12530.647, Time=0.12 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=-12543.945, Time=0.34 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept   : AIC=-12547.461, Time=0.69 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=-12554.554, Time=2.56 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=-12541.666, Time=2.30 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=-12547.410, Time=0.60 sec
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=-12539.425, Time=1.60 sec
 ARIMA(3,1,1)(0,0,0)[0]             : AIC=-12541.874, Time=0.46 sec

Best model:  ARIMA(3,1,1)(0,0,0)[0] intercept
Total fit time: 9.570 seconds

Auto ARIMA selected the p, d, q values that best fit the NSDAQ dataset (lowest AIC):

p = 3

d = 1

q = 1

Since the values are known, they can be used in the ARIMA model for NSDAQ.

In [26]:
NSDAQ_ARIMA_Model = ARIMA(train_NSDAQ, order=(3,1,1))
fitted_from_above_NSDAQ= NSDAQ_ARIMA_Model.fit()
fitted_from_above_NSDAQ.summary()
C:\Users\Simona\anaconda3\lib\site-packages\statsmodels\tsa\base\tsa_model.py:581: ValueWarning: A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.
  warnings.warn('A date index has been provided, but it has no'
Out[26]:
ARIMA Model Results
Dep. Variable:  D.Close NSDAQ USD   No. Observations:     2246
Model:          ARIMA(3, 1, 1)      Log Likelihood        6283.278
Method:         css-mle             S.D. of innovations   0.015
Date:           Sun, 05 Dec 2021    AIC                   -12554.556
Time:           23:17:32            BIC                   -12520.254
Sample:         1                   HQIC                  -12542.035

                           coef       std err   z         P>|z|    [0.025    0.975]
const                      0.0007     0.000     2.741     0.006    0.000     0.001
ar.L1.D.Close NSDAQ USD    0.4335     0.121     3.584     0.000    0.196     0.671
ar.L2.D.Close NSDAQ USD    0.0649     0.024     2.655     0.008    0.017     0.113
ar.L3.D.Close NSDAQ USD    -0.0876    0.021     -4.154    0.000    -0.129    -0.046
ma.L1.D.Close NSDAQ USD    -0.5023    0.120     -4.183    0.000    -0.738    -0.267

Roots
        Real       Imaginary   Modulus   Frequency
AR.1    1.7133     -1.1476j    2.0621    -0.0939
AR.2    1.7133     +1.1476j    2.0621    0.0939
AR.3    -2.6858    -0.0000j    2.6858    -0.5000
MA.1    1.9910     +0.0000j    1.9910    0.0000

Alpha for forecasting is set to 0.05, which corresponds to a 95% confidence interval.

In [27]:
# The forecast horizon must equal the length of the test set (750 rows) so that the index lines up
fc1, se, conf = fitted_from_above_AZN.forecast(len(test_AZN), alpha=0.05)
In [28]:
forecast_series_AZN = pd.Series(fc1, index=test_AZN.index)
lower_series_AZN = pd.Series(conf[:, 0], index=test_AZN.index)
upper_series_AZN = pd.Series(conf[:, 1], index=test_AZN.index)  # upper bound is column 1, not column 0
In [29]:
# Same for NSDAQ: the horizon equals the 250 test rows
fc2, se, conf = fitted_from_above_NSDAQ.forecast(len(test_NSDAQ), alpha=0.05)
In [30]:
forecast_series_NSDAQ = pd.Series(fc2, index=test_NSDAQ.index)
lower_series_NSDAQ = pd.Series(conf[:, 0], index=test_NSDAQ.index)
upper_series_NSDAQ = pd.Series(conf[:, 1], index=test_NSDAQ.index)  # upper bound is column 1, not column 0
In [31]:
plt.plot(train_AZN, 'cadetblue', label='AZN Historical data')
plt.plot(test_AZN, color = 'indianred', label='AZN Stock Price')
plt.plot(forecast_series_AZN, color = 'green')

plt.plot(train_NSDAQ, 'slategrey', label='NSDAQ Historical data')
plt.plot(test_NSDAQ, color = 'sandybrown', label='NSDAQ Stock Price')
plt.plot(forecast_series_NSDAQ, color = 'green',label='Prediction')

plt.legend(loc='upper left', frameon=False)
Out[31]:
<matplotlib.legend.Legend at 0x26a50d5fb80>

The green lines in the graph above show the forecast trend, which can be compared with the actual AZN and NSDAQ prices. The graph suggests that the forecast follows the overall direction of the data: the trend line slopes upward, as both stock prices (AZN and NSDAQ) are gradually increasing.

Additional measures can be used to check whether the model is acceptable. According to the methodology, if the mean absolute percentage error (MAPE) is around 2.5%, ARIMA can be used for predicting future prices.

Checking MAPE for AZN and NSDAQ ARIMA:

In [32]:
MAPE_AZN = np.mean(np.abs(fc1 - test_AZN)/np.abs(test_AZN))
MAPE_NSDAQ = np.mean(np.abs(fc2 - test_NSDAQ)/np.abs(test_NSDAQ))
print('MAPE AZN: '+str(MAPE_AZN))
print('MAPE NSDAQ: '+str(MAPE_NSDAQ))
MAPE AZN: 0.03651058508201623
MAPE NSDAQ: 0.03846177730169523

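As MAPE is higher than 2.5% for both stocks (3.65% for AZN and 3.85% for NSDAQ), the models do not meet that stricter threshold. MAPE is still below 10%, which is generally regarded as a good result. Note that the forecasts and the test values are both log prices, so these percentages measure the error on the log scale rather than in dollars.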
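
Because the MAPE in the cell above is computed on log prices, it is not directly comparable with a MAPE computed on prices. The sketch below, using made-up numbers rather than the AZN or NSDAQ forecasts, shows how the same forecast can give a noticeably smaller percentage error on the log scale; the helper name `mape` is hypothetical.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs(forecast - actual) / np.abs(actual))

# Hypothetical prices and a flat forecast, for illustration only.
actual_price = np.array([50.0, 55.0, 60.0])
forecast_price = np.array([52.0, 52.0, 52.0])

mape_price = mape(actual_price, forecast_price)
mape_log = mape(np.log(actual_price), np.log(forecast_price))

print("MAPE on prices:    ", mape_price)
print("MAPE on log prices:", mape_log)
```

For the models in this notebook, a price-scale MAPE could be obtained by applying `np.exp` to both the forecast and the test series before calling the function.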

Conclusions

  • Moving average technical analysis is widely adopted in trading strategies. The Exponentially Weighted Moving Average gives more weight to the most recent data points than the Simple Moving Average, which means that the EMA reacts more strongly to recent price changes and has lower lag.
  • A moving average can be calculated over any selected period. The most common choices are 20 or 50 days (for short-term trading) and 200 or 250 days (for long-term trading). As the analysis graphs show, a longer-period EMA is smoother because it reduces noise and flattens price fluctuations. The long-term EMA reveals the trend and price swing patterns.
  • If the EMA is in a down-trend, lower peaks and lower troughs are expected over time (in this analysis such a trend is seen for the long- and short-term AZN EMA and for the short-term NSDAQ, NSRGF and KO EMA); if the EMA is in an up-trend, higher peaks are expected (in this analysis a clear up-trend is seen for the long- and short-term PFE and AAPL EMA).
  • An ARIMA model is used to predict future trends from historical data. In this analysis, ARIMA models were built for the NSDAQ and AZN stocks; each model was trained on one part of the data and tested on the rest.
  • The ARIMA graph suggests that the forecast follows the overall direction of the data: the trend line slopes upward, as both stock prices (AZN and NSDAQ) are gradually increasing.
  • Since the dataset is non-stationary, the values were log-transformed before modelling. The final ARIMA models' MAPE is 3.65% for AZN and 3.85% for NSDAQ. The result is reasonably reliable, but more investigation should be dedicated to the ARIMA parameters (p, d, q) and to data decomposition.

Analysis Limitations

  • For EMA: it is unclear whether more emphasis should be placed on the most recent days for the long-term EMA. Some sources argue that overweighting recent dates creates a bias that leads to more false alarms.
  • Technical analysis is limited because it uses only historical data and ignores other factors that cause price fluctuations. Fundamental and sentiment analysis should be carried out in parallel for the most plausible results.
  • ARIMA p, d, q were selected automatically, and an index error was encountered during forecasting (the forecast horizon has to match the length of the test set).

For Further Analysis

  • Investigate EMA trading strategies that use both short- and long-term crossovers.
  • Analyze p, d, q for ARIMA more thoroughly and try several data decomposition methods to obtain a more suitable MAPE.
  • Apply seasonality in the ARIMA model and compose a prediction function.
  • Apply SARIMA or SARIMAX forecasting models.